[MoE][PyTorch] Add mask-based MoE permutation #1373

hxbai · 2024-12-13T04:49:02Z

Description

Add mask-based token permutation and local chunk permutation fused kernels. These kernels are implemented with OpenAI Triton.
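To make "mask-based token permutation" concrete, here is a minimal framework-free reference sketch, not the fused Triton kernel from this PR. It assumes the routing map is a boolean `[num_tokens, num_experts]` mask and that permuted rows are grouped expert by expert, keeping token order within each expert. The names `permute_by_mask` and `unpermute_by_index` are hypothetical and are not the TE API.

```python
def permute_by_mask(tokens, routing_mask):
    """Group token rows by expert (hypothetical reference helper).

    tokens:       list of rows, one per token
    routing_mask: routing_mask[t][e] is True when token t is routed to expert e
    """
    num_tokens = len(tokens)
    num_experts = len(routing_mask[0]) if routing_mask else 0
    # Expert-major order: every token of expert 0, then expert 1, and so on.
    sorted_indices = [
        t
        for e in range(num_experts)
        for t in range(num_tokens)
        if routing_mask[t][e]
    ]
    permuted = [list(tokens[t]) for t in sorted_indices]
    return permuted, sorted_indices


def unpermute_by_index(permuted, sorted_indices, num_tokens):
    """Scatter rows back to their source token, summing the per-expert copies."""
    hidden = len(permuted[0]) if permuted else 0
    restored = [[0.0] * hidden for _ in range(num_tokens)]
    for row, t in zip(permuted, sorted_indices):
        for j, value in enumerate(row):
            restored[t][j] += value
    return restored


# Three tokens, two experts; token 1 is routed to both experts.
tokens = [[1.0], [2.0], [3.0]]
mask = [[True, False], [True, True], [False, True]]
permuted, order = permute_by_mask(tokens, mask)
restored = unpermute_by_index(permuted, order, len(tokens))
```

In this sketch a token routed to k experts appears k times in the permuted output, so the unpermute step sums its copies. A real dispatcher would typically weight each copy by its routing probability before summing.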

Related commit in Megatron-LM: NVIDIA/Megatron-LM@ac0474d

Fixes # (issue)

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactor

Changes

Please list the changes introduced in this PR:

Non-breaking API changes in te.pytorch.permutation.moe_permute and te.pytorch.permutation.moe_unpermute
Add a new API, te.pytorch.permutation.moe_sort_chunks_by_indices
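The chunk-sorting operation can be pictured with a small reference sketch. It rests on one assumption about the semantics, which the description does not spell out: the rows are split into contiguous chunks of the given sizes, and the chunks are then concatenated in the requested order. The function below is a hypothetical pure-Python stand-in, not the signature of `moe_sort_chunks_by_indices`.

```python
def sort_chunks_reference(rows, split_sizes, sorted_indices):
    """Reorder contiguous chunks of rows (hypothetical reference helper).

    rows:           list of rows, already grouped into contiguous chunks
    split_sizes:    number of rows in each chunk, in their current order
    sorted_indices: sorted_indices[k] is the source chunk placed at position k
    """
    if sum(split_sizes) != len(rows):
        raise ValueError("split_sizes must add up to the number of rows")
    # Cut the input into chunks according to split_sizes.
    chunks = []
    start = 0
    for size in split_sizes:
        chunks.append(rows[start:start + size])
        start += size
    # Concatenate the chunks in the requested order.
    output = []
    for index in sorted_indices:
        output.extend(chunks[index])
    return output


# Chunks a (2 rows), b (1 row), c (3 rows), reordered to c, a, b.
rows = ["a0", "a1", "b0", "c0", "c1", "c2"]
reordered = sort_chunks_reference(rows, [2, 1, 3], [2, 0, 1])
```

Rows keep their relative order inside each chunk; only the chunk order changes. That makes the operation cheaper to express than a full per-row permutation.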

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

yanring · 2025-01-21T02:33:48Z

Hi @timmoon10 @phu0ngng, could you help take another look at this? We intend to incorporate this optimization into mcore v0.11 (6th Feb). Thanks a lot!

phu0ngng · 2025-01-21T16:46:05Z

/te-ci pytorch

timmoon10

Overall LGTM. My suggestions are stylistic.

* add mask-based moe permutation
* change moe_chunk_permute to moe_sort_chunks_by_indices
* fix __all__ in pytorch/permutation.py
* fix func/var names and typos; update tols in UT

---------

Signed-off-by: Hongxiao Bai <[email protected]>
Co-authored-by: Phuong Nguyen <[email protected]>
Co-authored-by: Tim Moon <[email protected]>
Signed-off-by: Youngeun Kwon <[email protected]>

hxbai changed the title ~~[MoE][Common/PyTorch] Add mask-based MoE permutation~~ [MoE][PyTorch] Add mask-based MoE permutation Dec 13, 2024

yaox12 reviewed Dec 13, 2024

transformer_engine/pytorch/permutation.py (resolved)

hxbai and others added 4 commits December 13, 2024 06:05

add mask based moe permutation

7e04f9a

Signed-off-by: Hongxiao Bai <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

664af70

for more information, see https://pre-commit.ci
Signed-off-by: Hongxiao Bai <[email protected]>

change moe_chunk_permute to moe_sort_chunks_by_indices

a8f1daa

Signed-off-by: Hongxiao Bai <[email protected]>

fix __all__ in pytorch/permutation.py

ca94d72

Signed-off-by: Hongxiao Bai <[email protected]>

hxbai force-pushed the permute_fusion branch from 6160104 to ca94d72 Compare December 13, 2024 06:05

phu0ngng self-requested a review January 8, 2025 15:20

timmoon10 reviewed Jan 8, 2025

timmoon10 self-requested a review January 8, 2025 21:57

phu0ngng reviewed Jan 10, 2025

transformer_engine/pytorch/triton/permutation.py (outdated, resolved)

transformer_engine/pytorch/triton/permutation.py (outdated, resolved)

hxbai and others added 5 commits January 16, 2025 04:56

fix func/var names and typos; update tols in UT

fca9406

Signed-off-by: Hongxiao Bai <[email protected]>

Merge branch 'main' into permute_fusion

dc7bbca

update copyright

b493a23

Signed-off-by: Hongxiao Bai <[email protected]>

update doc

2fae821

Signed-off-by: Hongxiao Bai <[email protected]>

minor fix in UT

2b337d9

Signed-off-by: Hongxiao Bai <[email protected]>

Merge branch 'main' into permute_fusion

b50363d

phu0ngng approved these changes Jan 21, 2025

timmoon10 approved these changes Jan 24, 2025

tests/pytorch/test_permutation.py (outdated, resolved)

tests/pytorch/test_permutation.py (outdated, resolved)

tests/pytorch/test_permutation.py (resolved)

minor fixes in the UT file

9db4f52

Signed-off-by: Hongxiao Bai <[email protected]>

phu0ngng merged commit 2fce82b into NVIDIA:main Jan 27, 2025
14 checks passed
